vendor imara-diff #2506

Merged
Sebastian Thiel (Byron) merged 3 commits into main from vendor-imara-diff
Apr 13, 2026

@Byron Sebastian Thiel (Byron) commented Apr 7, 2026

Take imara-diff in and unify both versions of it, one that has sliders, the other one that doesn't.
Also see what LLMs can do when they can easily see both versions at the same time.

Tasks

  • first vendor with separate crates for 0.1 and 0.2, used by all code and it compiles
  • refackiew fix-ci
  • cleanup (size adjustment, file removal)
  • enable fuzz-test to show that timeout is fixed.

Out of Scope

  • port gix-diff and its unified diff writer over to slider-aware imara-diff v0.2
  • let line-stats still use v0.1 and make 0.2 the default
  • merge everything into one crate, reuse shared types, etc.


@github-advanced-security github-advanced-security bot left a comment

zizmor found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@Byron Sebastian Thiel (Byron) force-pushed the vendor-imara-diff branch 5 times, most recently from d22622d to 6d0fd32 on April 13, 2026 08:10

This commit contains the initial code as documented, but also makes all adjustments
needed to make it all pass CI. This essentially means that all backports have to be
done by hand, but I am sure this can be automated.

- Add imara-diff@d2930d174bd4469f4932b4658f4fa505e8e0b655 (0.1) for vendoring
  This is the baseline version, with a DoS issue fixed due to out-of-bound
  computation.
  Drop large JSON performance-test fixtures and the perf-only test that used them.
- Add imara-diff@32d1e45d3df061e6ccba6db7fdce92db29e345d8 (0.2) for vendoring
  It contains a DoS fix for a runaway computation.
  Drop large JSON performance-test fixtures and the perf-only test that used them.
- Make vendored, fixed imara-diff crates part of the workspace
- remove big test-files to reduce noise
- fix CI

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fd49295c5e

The problem was detected by Google OSS-Fuzz, and occurs because
of a window-size problem in imara-diff's preprocessing.

With `gix-imara-diff` the problem is fixed as it's now vendored.

Copilot AI left a comment

Pull request overview

This PR vendors imara-diff into the workspace by adding two internal crates (gix-imara-diff-01 for v0.1.8 and gix-imara-diff for v0.2.0) and rewires existing crates to depend on the vendored versions to avoid pulling multiple upstream variants.

Changes:

  • Add new workspace crates gix-imara-diff-01 (v0.1.8) and gix-imara-diff (v0.2.0), including tests and fuzzing harnesses.
  • Update gix-diff, gix-merge, and fuzz crates to use the vendored crates via package = ... + path = ....
  • Adjust justfile checks to validate dependency absence/presence using the new crate names.

Reviewed changes

Copilot reviewed 55 out of 57 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| justfile | Updates workspace check logic to look for vendored gix-imara-diff* crates in the dependency tree. |
| gix-merge/fuzz/Cargo.toml | Switches fuzz crate to vendored gix-imara-diff-01. |
| gix-merge/Cargo.toml | Switches merge crate to vendored gix-imara-diff-01. |
| gix-imara-diff/tests/sliders.rs | Adds a placeholder/commented test file for slider-related experimentation. |
| gix-imara-diff/tests/integration/main.rs | Adds integration tests for tokenization, unified diff formatting, and word-diff behavior. |
| gix-imara-diff/tests/fixtures/make_git_diffs.sh | Adds a helper script intended to generate fixture diffs from real repos. |
| gix-imara-diff/src/util.rs | Adds shared utilities for common-prefix/suffix, sqrt approximation, and hunk navigation. |
| gix-imara-diff/src/unified_diff.rs | Adds unified diff formatting support (v0.2-style API). |
| gix-imara-diff/src/tests.rs | Adds unit tests targeting Myers split parity and unified diff output expectations. |
| gix-imara-diff/src/sources.rs | Adds token sources (lines/words/byte lines) used for interning/tokenization. |
| gix-imara-diff/src/slider_heuristic.rs | Adds slider heuristics (including indentation-based heuristic) for postprocessing. |
| gix-imara-diff/src/postprocess.rs | Adds postprocessing to slide/merge hunks using heuristics. |
| gix-imara-diff/src/myers/slice.rs | Adds Myers file-slice abstraction for divide-and-conquer. |
| gix-imara-diff/src/myers/preprocess.rs | Adds Myers preprocessing to prune tokens and improve performance. |
| gix-imara-diff/src/myers/middle_snake.rs | Adds middle-snake search implementation used by Myers split. |
| gix-imara-diff/src/myers.rs | Adds the Myers diff implementation (with heuristics/minimal mode). |
| gix-imara-diff/src/intern.rs | Adds token interning + TokenSource/InternedInput infrastructure. |
| gix-imara-diff/src/histogram/list_pool.rs | Adds pooled list storage used by histogram diff bookkeeping. |
| gix-imara-diff/src/histogram/lcs.rs | Adds histogram LCS search logic. |
| gix-imara-diff/src/histogram.rs | Adds histogram diff algorithm and fallback to Myers on repetitive inputs. |
| gix-imara-diff/README.md | Vendors upstream README (benchmarks, stability policy, fuzzing notes). |
| gix-imara-diff/LICENSE | Vendors upstream license text. |
| gix-imara-diff/fuzz/README.md | Adds documentation for running fuzz targets locally. |
| gix-imara-diff/fuzz/fuzz_targets/unified_diff_printer.rs | Adds fuzz target for unified diff formatting invariants. |
| gix-imara-diff/fuzz/fuzz_targets/postprocess_heuristics.rs | Adds fuzz target for postprocessing under different heuristics. |
| gix-imara-diff/fuzz/fuzz_targets/diff_compute_with.rs | Adds fuzz target for the compute_with API over raw token sequences. |
| gix-imara-diff/fuzz/fuzz_targets/comprehensive_diff.rs | Adds fuzz target combining diffing, postprocessing, and word-diff calls. |
| gix-imara-diff/fuzz/Cargo.toml | Defines the vendored fuzz package wiring to gix-imara-diff. |
| gix-imara-diff/fuzz/.gitignore | Ignores fuzz artifacts/corpus/coverage directories. |
| gix-imara-diff/Cargo.toml | Adds vendored crate manifest for v0.2.0 (gix-imara-diff). |
| gix-imara-diff/.gitignore | Ignores vendored crate build + bench data directories. |
| gix-imara-diff/.gitattributes | Normalizes EOL for fixture files. |
| gix-imara-diff-01/src/util.rs | Vendors v0.1 utility helpers used by Myers/histogram. |
| gix-imara-diff-01/src/unified_diff.rs | Vendors v0.1 unified diff sink/builder. |
| gix-imara-diff-01/src/tests.rs | Vendors v0.1 test suite for unified diff output. |
| gix-imara-diff-01/src/sources.rs | Vendors v0.1 token sources (including with/without line terminators). |
| gix-imara-diff-01/src/sink.rs | Vendors v0.1 sink abstraction and token insertion/removal counter. |
| gix-imara-diff-01/src/myers/slice.rs | Vendors v0.1 Myers file-slice abstraction. |
| gix-imara-diff-01/src/myers/preprocess.rs | Vendors v0.1 Myers preprocessing (including prefix/postfix stripping). |
| gix-imara-diff-01/src/myers/middle_snake.rs | Vendors v0.1 middle-snake search implementation. |
| gix-imara-diff-01/src/myers.rs | Vendors v0.1 Myers algorithm wiring + sink processing. |
| gix-imara-diff-01/src/lib.rs | Vendors v0.1 public API and algorithm selection. |
| gix-imara-diff-01/src/intern.rs | Vendors v0.1 interning infrastructure. |
| gix-imara-diff-01/src/histogram/list_pool.rs | Vendors v0.1 histogram list pool implementation. |
| gix-imara-diff-01/src/histogram/lcs.rs | Vendors v0.1 histogram LCS search. |
| gix-imara-diff-01/src/histogram.rs | Vendors v0.1 histogram diff wiring + fallback behavior. |
| gix-imara-diff-01/README.md | Vendors upstream README for v0.1 crate. |
| gix-imara-diff-01/LICENSE | Vendors upstream license text for v0.1 crate. |
| gix-imara-diff-01/Cargo.toml | Adds vendored crate manifest for v0.1.8 (gix-imara-diff-01). |
| gix-imara-diff-01/.gitignore | Ignores vendored v0.1 crate build + lock + bench data. |
| gix-imara-diff-01/.gitattributes | Normalizes EOL for v0.1 fixture files. |
| gix-diff/Cargo.toml | Repoints imara-diff and imara-diff-v2 dependencies to vendored workspace crates. |
| Cargo.toml | Adds vendored crates to workspace members. |
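
The common-prefix/suffix helpers and the prefix/postfix stripping listed in the file summary reflect a standard diff preprocessing step: tokens shared at the start and end of both inputs cannot be part of any change, so they can be set aside before the main algorithm runs. The following is a minimal Rust sketch of that idea, not the vendored code. The function names `common_prefix_len`, `common_suffix_len`, and `strip_common` are hypothetical and chosen only for illustration.

```rust
/// Length of the longest common prefix of two token slices.
/// (Hypothetical helper, not the vendored implementation.)
fn common_prefix_len<T: PartialEq>(before: &[T], after: &[T]) -> usize {
    before.iter().zip(after).take_while(|(a, b)| a == b).count()
}

/// Length of the longest common suffix of two token slices.
fn common_suffix_len<T: PartialEq>(before: &[T], after: &[T]) -> usize {
    before
        .iter()
        .rev()
        .zip(after.iter().rev())
        .take_while(|(a, b)| a == b)
        .count()
}

/// Strip the shared prefix and suffix, returning the prefix length
/// and the two remaining "middle" slices that still need diffing.
fn strip_common<'a, T: PartialEq>(
    before: &'a [T],
    after: &'a [T],
) -> (usize, &'a [T], &'a [T]) {
    let prefix = common_prefix_len(before, after);
    let (b, a) = (&before[prefix..], &after[prefix..]);
    // The suffix is measured on the remainder so it can never overlap the prefix.
    let suffix = common_suffix_len(b, a);
    (prefix, &b[..b.len() - suffix], &a[..a.len() - suffix])
}

fn main() {
    let before = ["fn main() {", "    old();", "}"];
    let after = ["fn main() {", "    new();", "}"];
    let (prefix, b, a) = strip_common(&before, &after);
    println!("prefix = {prefix}, before middle = {b:?}, after middle = {a:?}");
}
```

Measuring the suffix only on what remains after the prefix keeps inputs such as `["a", "a"]` versus `["a"]` from counting the same token twice.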

Co-authored-by: Sebastian Thiel <sebastian.thiel@icloud.com>
@Byron
Member Author

Whatever happened here, it's good and shows that 0.2 is the only version we'd want, particularly for the linecount. No need for extra complexity, apparently.

> cargo bench -p gix-diff --features blob-experimental
     Running benches/line_count.rs (target/release/deps/line_count-59627e52fae8c28f)
Gnuplot not found, using plotters backend
imara-diff 0.1          time:   [3.3171 ms 3.3266 ms 3.3363 ms]

imara-diff 0.2          time:   [2.2665 ms 2.2753 ms 2.2853 ms]
Found 9 outliers among 100 measurements (9.00%)
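
To put the two median timings above in relative terms, here is a small Rust sketch of the arithmetic. The helper names `speedup` and `time_reduction_percent` are made up for this illustration, and the inputs are simply the median values copied from the benchmark output.

```rust
/// Ratio of baseline time to candidate time; values above 1.0 mean the candidate is faster.
fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}

/// Percentage of the baseline time saved by the candidate.
fn time_reduction_percent(baseline_ms: f64, candidate_ms: f64) -> f64 {
    (1.0 - candidate_ms / baseline_ms) * 100.0
}

fn main() {
    // Median timings taken from the benchmark output above.
    let v01_ms = 3.3266;
    let v02_ms = 2.2753;
    println!("speedup: {:.2}x", speedup(v01_ms, v02_ms));
    println!(
        "time reduction: {:.1}%",
        time_reduction_percent(v01_ms, v02_ms)
    );
}
```

On these medians, 0.2 comes out roughly 1.46 times as fast as 0.1, which corresponds to about 31.6% less time per iteration of the line-count benchmark.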

@Byron Sebastian Thiel (Byron) merged commit 8f091d1 into main Apr 13, 2026
32 checks passed
@Byron Sebastian Thiel (Byron) deleted the vendor-imara-diff branch April 13, 2026 10:44
4 participants